Sequence comparison of the mitochondrial genomes of five brackish water species of the family Neritidae: Phylogenetic implications and divergence time estimation

Abstract Neritids are ancient gastropod species which can live in marine, brackish water, and freshwater environments. In this study, we sequenced and annotated the mitochondrial genomes of five brackish water neritids (i.e., Clithon corona, Clithon lentiginosum, Clithon squarrosum, Neritina iris, and Septaria lineata). The mitogenomes ranged from 15,618 to 15,975 bp, and all contain 13 protein‐coding genes (PCGs), 22 tRNA genes, and two rRNA genes, with a closed ring structure. We calculated the Ka/Ks values of all 13 PCGs of Neritidae species, all ratios are less than 1, under purification selection. Phylogenetic analysis of the 13 PCGs showed that Neritimorpha is a sister group with Vetigastropoda and Caenogastopoda, genus Clithon is a sister group with Neritina and Septaria. Estimation of divergence time for all species of Neritidae showed that the main differentiation of Neritidae occurred in Cenozoic period (65 Mya), C. corona and C. lentiginosum were differentiated in the Cenozoic Neogene, the other three species diverged in the Cenozoic Paleogene. These results will help to better understand the evolutionary position of Neritidae and provide reference for further phylogenetic research on Neritidae species.


| INTRODUC TI ON
Neritidae (Gastropoda: Neritimorpha: Cycloneritida) is one of the most diverse taxa in the Neritimorpha (Rafinesque, 1815). At present, there are 16 genera, comprising around 280 species (Hamish et al., 2007), with about 40 species having been found on the southeast coast of China before 2008 (Zhang, 2008). The fossil record of neritids dates back to the late Cretaceous (Kano, 2002), showing ecological radiation and extreme diversity in form. Neritids occur mainly in intertidal zone (Sasaki et al., 2002). They are euryhaline, and can live in marine, brackish water, and freshwater ecosystems, Nerita species are almost exclusively found in marine environments, Clithon and Neritina animals are mostly found in freshwater and brackish water environments (Tan & Clements, 2008). There have been at least five or six evolutionary transitions from hypersaline environments to freshwater in the evolutionary history of Neritidae (Frey, 2010;Holthuis, 1995). However, most freshwater lineages retain a dispersed planktonic marine larval stage, in which adults develop, reproduce in rivers, hatch larvae enter the sea, grow into adults, and return to freshwater in a cycle (Abdou et al., 2015).
The metazoan mitochondrial genome (mitogenome) is a doublestranded molecular structure in the form of a closed ring. It usually has 37 coding genes, including 13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNA), 22 transfer ribonucleic acid (tRNA) genes, and a noncoding control region (CR) (Fernández-Silva et al., 2003;Wolstenholme & David, 1992). The mitogenome is characterized by high conservation, lack of extensive recombination, maternal inheritance, and a high mutation rate (Curole & Kocher, 1999;William et al., 2004). Compared with some gene fragments, such as COI (cytochrome c oxidase subunit 1) and 16S rRNA, mitogenome sequences can better elucidate evolutionary relationships between species, it has been widely used in phylogenetic researches (Zardoya & Meyer, 1996).
Next-generation sequencing (NGS) have been widely used in phylogenetic analysis, and the study of Neritidae classification has been ongoing for a long period. However, there are insufficient studies on the mitochondrial data and divergence time of neritids. In this study, we chose five neritid species: Clithon corona, Clithon lentiginosum, Clithon squarrosum, Neritina iris, and Septaria lineata, which can live in both fresh and brackish water environments. After sequencing, assembly, annotation, and analysis of the complete mitogenome, we analyzed their basic characteristics in the five species, calculated the average nonsynonymous to synonymous substitution ratio (Ka/Ks) of 19 Neritidae species, constructed the phylogenetic tree of mitogenomes of Gastropoda to analyze the phylogenetic position and relationship in Neritidae, and speculated the differentiation time of neritids.
2 | MATREIAL S AND ME THODS 2.1 | Sample collection and DNA extraction Five species of Neritidae C. corona, N. iris, S. lineata, C. squarrosum, and C. lentiginosum were collected from the coastal area of Huizhou, Guangdong Province, China ( Table 1). The preliminary morphological identification of these samples was carried out by consulting the taxonomic experts of the Marine Biological Museum of Zhejiang Ocean University. Store samples in absolute ethanol, take a small piece of fresh foot tissue to extract total DNA by salting-out method (Aljanabi & Martinez, 1997), and store at −20°C.

| Mitogenome sequencing, assembly, and annotation
The complete mitogenomes of five species were sequenced on the Illumina Hiseq X Ten platform by Origingene Bio-pharm Technology Co., Ltd. (Shanghai, China). The Covaris M220 physical method (ultrasonic) was used to fragment the DNA, and the length of the fragments was 300−500 bp. Then, the DNA fragments were purified to construct a sequencing library. The Illumina HiSeq™ platform was used for sequencing after library quality inspection, and a 10 Gb data volume was used for sequencing. Data quality control was performed by Trimmomatic v0.39 (http://www.usade llab. org/cms/index.php?page=trimm omatic) (Bolger et al., 2014), filter out low-quality reads, duplicated reads, sequences with an "N" rate greater than 10%, and sequencing linker sequences. Clean data with high quality was obtained and the reads of the five species were de novo assembled using NOVOPlasty assembly software (https:// github.com/ndier ckx/NOVOP lasty) (Dierckxsens et al., 2017). The stack cluster was compared with reference genome in the GenBank database, and majority of the mitogenome sequence information was obtained. Then, the online software MITOS (http://mitos.bioinf.uni-leipz ig.de/index.py) was used for structural and functional annotation and manual correction (Bernt et al., 2013), the complete mitogenome was finally obtained. Sequenced mitogenomes were uploaded to GenBank database at the National Center for Biotechnology Information (NCBI).

| Phylogenetic analyses
The phylogenetic analyses of the five species were performed using the sequences of complete mitogenomes from 81 species (  (Xia, 2018), the PCGs of each sample were concatenated together in the same order, the tree building sequence set was obtained by combining them in a unified sequence. The PCGs sequences of these 81 species were aligned using ClustalW of MEGA-X. Nucleotide substitution saturation was analyzed using DAMBE 7 to evaluate whether these sequences were suitable for phylogenetic tree construction.
The Bayesian inference (BI) method of the program MrBayes 3.2.7a (Ronquist et al., 2012) and the maximum likelihood (ML) method of IQ-tree 2.1.3 (Minh et al., 2020) were used to analyze the phylogenetic relationships. The Bayesian method model measurement firstly used PAUP 4 (Swofford, 2002) software for format conversion, and then used MRMTGUI (Nuin, 2005) software to associate PAUP 4, ModelTest 3.7 (Posada, 2005) and MRModelTest 2.3 (Nylander et al., 2004) programs to determine the best alternative model under the Akaike information criterion (AIC) as GTR + I + G.
BI analysis was performed using two Markov chain Monte Carlo (MCMC) run with 2 million generations, and sampling was performed once every 1000 generations. the first 25% of trees were discarded as burn-in, and convergence for independent operation was evaluated using the mean standard deviation of the splitting frequency (<0.01).
The ML tree best fit replacement model (GTR + F + I + G4) selected by Bayesian information criterion (BIC) using ModelFinder (Kalyaanamoorthy et al., 2017), setting the boot copy number with 1000 ultra-fast bootstraps in order to reconstruct the consensus tree. Finally, the phylogenetic tree was viewed, edited, and visualized using the Figtree 1.4.4 (Rambaut, 2018) software. The analysis was performed using the BEAST 1.8.4 software (Drummond et al., 2012). We used an uncorrelated lognormal relaxed molecular clock model. The Yule process was used for the tree prior, and divergence time calibration was used for the dis- genes (seven PCGs and eight tRNA genes) are located on the heavy chain, while the others were located on the light chain ( Figure 1).

| Estimation of divergence times
The longest gene was ND5, with a length of 1702 to 1717 bp, and the shortest was the ATP8 gene, with a consistent length of only 165 bp (Table 3).
In the five mitogenomes at present study, the average AT content was higher than CG, with a bias of 64.90%. The average AT-skew was −0.0545, and GC-skew was 0.1486 ( Table 4). The base content of As was lower than that of Ts, and the base content of Gs was higher than that of Cs. In general, the average content of each species in TA B L E 2 List of Gastropoda species used in phylogenetic analysis with their GenBank accession numbers, and five newly sequenced Neritid species were marked with ✽ the complete mitogenome was T > A > G > C (Table 3), which is consistent with the reported complete neritids mitogenomes (Arquez et al., 2014;Feng et al., 2020Feng et al., , 2021.

| Protein-coding genes and codon usage
The mitogenome of the Neritidae in this study contains 13 PCGs, including a cytochrome b (Cyt b), two ATPases (ATP6 and ATP8), three cytochrome oxidases (COI-III), and seven NADH dehydrogenases (ND1-6 and ND4L). The length of the PCGs in these five species is between 11,054 and 11,140 bp ( Table 3). The base composition of these species also showed a high AT bias, with the highest AT content being seen in S. lineata, at 65.75%. The AT bias values of each species were negative, in addition to N. iris at −0.07, the values of the other four species are −0.05, with the T base content being higher than that of the A base. In these five neritid species, the start codon was ATN, almost all genes initiated with ATG, and only a few genes initiated with ATA ( supplemented during transcription to obtain a complete stop codon T(AA) (Ojala et al., 1981). In the complete mitogenome of the Neritidae, the control region (CR) is the largest noncoding region, and the mitochondrial CR of all neritid species in this study was located between tRNA-Glu and COIII, with a length of 527-891 bp (Table 6). This area usually presents a high AT bias, being an A + T rich area. This is an essential element involved in mitogenome replication and transcription initiation (Fernández-Silva et al., 2003).

| Ka/Ks
Ka/Ks has been used as an effective way to understand the dynamic evolution of protein-coding genes. Therefore, the Ka/Ks ratios of the 13 PCGs were calculated using the 19 sequenced Neritidae species in order to study the relationship between evolution and selection pressure (Figure 3). The results showed that the Ka/Ks ratios of the PCGs range from 0.053 for COI to 0.712 for ND6. COI has the lowest Ka/Ks value, suggesting that COI is under the lowest selective pressure to conserve the protein sequence. It is therefore widely used as a potential molecular marker in species identification and phylogenetic studies (Astrin et al., 2016).
In general, a gene is considered to be positively selected only when the Ka/Ks ratio is greater than 1. The majority of the 13 PCGs genes of the species involved in this study had relatively lower Ka/

TA B L E 3 (Continued)
Ks ratios, ratio is less than 1. Therefore, we suggest that these PCGs may be under the influence of purification selection.

| Phylogenetic relationships
The 13 PCGs of the mitogenome of 79 species from five subclasses of Gastropoda (Vetigastropoda, Caenogramopoda, Neritimorpha, Patellogramopoda, and Heterobranchia) and other two species as outgroups were used to construct phylogenetic trees ( Figure 4, Table 2). The result showed that the ML tree and BI tree have a consistent topological structure, therefore, only the topology of BI tree was displayed, with strong bootstrapping for the ML tree and posterior probability values.
Our phylogenetic analysis showed that Neritimorpha is closely   (Figure 5), which is close to previous studies (Feng et al., 2020(Feng et al., , 2021. The first divergence of the Neritimorpha was in the Triassic period, the first period of Mesozoic, which was the transition period involving the disappearance of Paleozoic biota and the formation of post-modern biota. During this period, the marine invertebrate fauna underwent great changes . In the Neritidae, the differentiation of the four genera occurred about 102.74 Mya, the results obtained from this analysis were slightly older than the age of the origin of the Spadonidae estimated in previous reports (Feng et al., 2020(Feng et al., , 2021. This may be due to differences

CO N FLI C T O F I NTE R E S T
All the authors declared no potential interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The following information was supplied regarding the availability